less than 4. Having 30 events — which in this case are fatal car accidents — isn’t statistically
significantly different from having 40 events in the same time period. As you see from the result, the
increase of 10 in one year is likely statistical noise. But had the number of events increased more
dramatically — say from 30 to 50 events — the increase would have been statistically significant.
This is because (50 − 30)² / (50 + 30) = 400 / 80 = 5.0, which is greater than 4.
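The comparison above can be sketched in a few lines of Python. This is a minimal illustration, assuming the test statistic is the squared difference of the two counts divided by their sum (which is consistent with the values quoted above) and that both counts come from equal exposure. The function names are hypothetical, and the cutoff of 3.84 is the chi-square critical value for 1 degree of freedom at α = 0.05, which the text rounds to 4.

```python
# Assumed rule: two event counts observed over equal exposure differ
# significantly (p < 0.05) when (n1 - n2)^2 / (n1 + n2) exceeds about 4.
CRITICAL_VALUE = 3.84  # chi-square critical value, 1 df, alpha = 0.05


def count_difference_statistic(n1: int, n2: int) -> float:
    """Squared difference of two event counts divided by their sum."""
    if n1 + n2 == 0:
        raise ValueError("at least one event is needed to compare counts")
    return (n1 - n2) ** 2 / (n1 + n2)


def counts_differ(n1: int, n2: int, critical: float = CRITICAL_VALUE) -> bool:
    """True when the statistic exceeds the chosen critical value."""
    return count_difference_statistic(n1, n2) > critical


for before, after in [(30, 40), (30, 50)]:
    stat = count_difference_statistic(before, after)
    verdict = "significant" if counts_differ(before, after) else "not significant"
    print(f"{before} vs {after} events: statistic = {stat:.2f} ({verdict})")
```

With these inputs, the 30-versus-40 comparison gives about 1.43 and the 30-versus-50 comparison gives 5.0, matching the conclusions in the text.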
Estimating the Required Sample Size
As in all sample-size calculations, you need to specify the desired statistical power and the α level of
the test. Let’s set power to 80 percent and α to 0.05, as these are common settings. When comparing
event rates (R1 and R2) between two groups with R1 as the reference group, you must also specify:
• The expected rate in the reference group (R1)
• The effect size of importance, expressed as the rate ratio (RR = R2/R1)
• The expected ratio of exposure in the two groups
For example, suppose that you’re designing a study to test whether the incidence of rotavirus
gastroenteritis differs between City XYZ and City ABC. You’ll enroll an equal number of City XYZ and
City ABC residents, and follow them for one year to see whether they get rotavirus. Suppose that the
one-year incidence of rotavirus in City XYZ, the reference group, is 1 case per 100 person-years (an
incidence rate of 0.01 case per person-year, or 1 percent per year). You want an 80 percent chance of
getting a statistically significant result at the 0.05 level (that is, power of 80 percent and α = 0.05).
When comparing the incidence rates, you care only about differences of 25 percent or more,
which translates to an RR of 1.25. This means you expect to see 0.01 × 1.25 = 0.0125 cases per
person-year in City ABC.
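As a quick check of the arithmetic in this example, the following sketch derives the comparison rate from the reference rate and the rate ratio, then converts both rates to expected cases per 1,000 person-years. The variable and function names are illustrative only, and the 1,000 person-year denominator is an arbitrary choice made for readability.

```python
# Design inputs taken from the example in the text.
reference_rate = 0.01   # City XYZ: cases per person-year
rate_ratio = 1.25       # smallest rate ratio considered important

# Expected rate in the comparison group (City ABC).
comparison_rate = reference_rate * rate_ratio


def expected_cases(rate: float, person_years: float) -> float:
    """Expected number of events for a given rate and amount of follow-up."""
    return rate * person_years


for label, rate in [("City XYZ", reference_rate), ("City ABC", comparison_rate)]:
    cases = expected_cases(rate, 1_000)
    print(f"{label}: {rate:.4f} cases per person-year, "
          f"{cases:.1f} expected cases per 1,000 person-years")
```

The gap between the two groups is only a few cases per 1,000 person-years, which hints at why detecting it takes a lot of follow-up.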
If you want to use G*Power to do your power calculation (see Chapter 4), under Test family, choose z
tests for population-level tests. Under Statistical test, choose Proportions: Difference between two
independent proportions because the two rates are independent. Under Type of power analysis,
choose A priori: Compute required sample size – given α, power and effect size, and under the Input
Parameters section, choose two tails so you can test whether one rate is higher or lower than the other.
Set Proportion p1 to 0.01 (to represent City XYZ’s incidence rate), Proportion p2 to 0.0125 (to represent
City ABC’s expected incidence rate), α err prob (α) to 0.05, and Power (1-β err prob) (power) to 0.8
for 80 percent, and keep a balanced Allocation ratio N2/N1 of 1. After clicking Calculate, you’ll see
you need at least 27,937 person-years of observation in each group, which means observing about
56,000 participants (2 × 27,937 = 55,874) over a one-year study. The shockingly large target sample
size illustrates a challenge when studying incidence rates of rare illnesses.
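If you prefer to check the calculation in code, the sketch below applies the standard normal-approximation sample-size formula for a two-sided z test of two independent proportions with equal allocation. It is an assumption that this formula mirrors what G*Power does internally, and, like the G*Power setup above, it treats the one-year incidence rates as proportions. The function name is hypothetical. It uses only the Python standard library (statistics.NormalDist requires Python 3.8 or later).

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05,
                                power: float = 0.80) -> int:
    """Per-group sample size for a two-sided z test comparing two
    independent proportions with equal allocation (N2/N1 = 1)."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power)            # quantile for the desired power
    p_bar = (p1 + p2) / 2                # pooled proportion under the null
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    n = ((z_alpha * null_sd + z_beta * alt_sd) / (p1 - p2)) ** 2
    return ceil(n)


n = n_per_group_two_proportions(0.01, 0.0125)
print(f"Per group: {n:,}; both groups: {2 * n:,}")
```

With p1 = 0.01 and p2 = 0.0125, the result should land very close to the per-group figure reported above. Rerunning the function with a larger rate ratio (for example, p2 = 0.02) shows how quickly the required sample shrinks as the effect size grows.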